Create your own markdown and start committing!

library(ggplot2)
library(sf)
## Linking to GEOS 3.13.0, GDAL 3.8.5, PROJ 9.5.1; sf_use_s2() is TRUE
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ lubridate 1.9.4     ✔ tibble    3.3.0
## ✔ purrr     1.1.0     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
  1. Let’s also grab some data here. This is a spatial point dataset that I collected as part of a project in the Open Spaces and Mountain Parks of Boulder, Colorado. It consists of the points where people have taken pictures using Flickr and Panoramio. We have also collected several spatial variables that might explain why individuals take photographs at these points, and we recorded the same variables for all other points in the park. We will import the data as an sf spatial dataset.
#getwd()

boulder <- st_read("/Users/paigelund/Desktop/EAS_548/advanced_geovisualization_week_two/BoulderSocialMedia/BoulderSocialMedia.shp")
## Reading layer `BoulderSocialMedia' from data source 
##   `/Users/paigelund/Desktop/EAS_548/advanced_geovisualization_week_two/BoulderSocialMedia/BoulderSocialMedia.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 55519 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -788775 ymin: 1917813 xmax: -780555 ymax: 1930053
## Projected CRS: NAD_1983_Albers
boulder
## Simple feature collection with 55519 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -788775 ymin: 1917813 xmax: -780555 ymax: 1930053
## Projected CRS: NAD_1983_Albers
## First 10 features:
##            id     DB   extent Climb_dist TrailH_Dis NatMrk_Dis Trails_dis
## 1  6517284333 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 2  6517281191 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 3  6517278961 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 4  6517276295 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 5  6517274727 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 6  6517272539 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 7  6517270109 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 8  6516904527 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 9  6516902971 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
## 10 6516900761 Flickr 421678.2   1973.108   2368.567   2451.633   49.73422
##    Bike_dis PrarDg_Dis PT_Elev Hydro_dis Street_dis                geometry
## 1  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 2  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 3  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 4  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 5  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 6  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 7  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 8  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 9  1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)
## 10 1437.134   1942.125    2064   1359.75   193.9165 POINT (-786099 1929916)

Here are the details of the data:

| Variable | Description |
|---|---|
| DB | indicates whether the point is a social media location (Flickr or Panoramio) or a point in the park |
| extent | extent that can be viewed at each point, estimated through viewshed analysis |
| Climb_dist | distance to nearest climbing wall |
| TrailH_Dis | distance to hiking trails |
| NatMrk_Dis | distance to natural landmark |
| Trails_dis | distance to walking trails |
| Bike_dis | distance to biking trails |
| PrarDg_Dis | distance to prairie dog mounds |
| PT_Elev | elevation |
| Hydro_dis | distance to lakes, rivers and creeks |
| Street_dis | distance to streets and parking lots |
  1. We can plot these variables using ggplot2. We add the sf data with the geom_sf() function, whose arguments control the attributes of the drawn objects (points, lines or polygons). For example, fill = controls the interior color of the object (color = controls its outline), and alpha = controls its opacity. The final layer is a complete theme, which controls the non-data display (e.g. neatlines, graticule, title). More details on these themes can be found [here](https://ggplot2.tidyverse.org/reference/ggtheme.html). Here we use theme_bw(), the black and white theme. You can try other themes to explore the different options.
ggplot() +
    geom_sf(data =boulder,
    fill = NA, alpha = .2) +
    theme_bw()

  1. At the moment, the projection is a bit weird. Let’s reproject the data using an appropriate projection for Colorado. You can use the epsg.io website to choose one; here we use EPSG:26753 (NAD27 / Colorado North).
boulder = st_transform(boulder, 26753) 
ggplot() +
    geom_sf(data =boulder,
    fill = NA, alpha = .2) +
    theme_bw()
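
To confirm that a reprojection did what you expect, it can help to inspect the CRS before and after. The sketch below is not part of the lab data: it builds a single hypothetical point near Boulder in longitude/latitude (EPSG:4326, an assumption made only for illustration) and transforms it to the same EPSG code used above.

```r
library(sf)

# Hypothetical point near Boulder, CO, in lon/lat (approximate, for illustration only)
pt <- st_sfc(st_point(c(-105.27, 40.01)), crs = 4326)

# Reproject to the EPSG code used in the lab
pt_co <- st_transform(pt, 26753)

# Inspect the CRS of each object
st_crs(pt)$input
st_crs(pt_co)$input

# The coordinates are now in projected units rather than degrees
st_coordinates(pt_co)
```

With the real data, running st_crs(boulder) before and after st_transform() gives the same kind of check.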

  1. Now we will explore different methods for visualizing this data. We will add ‘Gradient colour scales’ in ggplot2. Here is the documentation of these options https://ggplot2.tidyverse.org/reference/scale_gradient.html.
ggplot() +
    geom_sf(data =boulder, aes(color=PT_Elev),
    fill = NA, alpha = .2) +
    theme_bw()

  1. ggplot2 has several gradient colour scale options; the details are in the scale documentation linked above. Here we use scale_colour_gradientn() with the terrain.colors() palette.
ggplot() +
    geom_sf(data =boulder, aes(color=PT_Elev),
    fill = NA, alpha = .2) +
  scale_colour_gradientn(colours = terrain.colors(10)) +  
  theme_bw()

  1. Let’s look at the locations above 2200 meters. For this we will use the ifelse() function. It takes a test and two values: where the test (PT_Elev >= 2200) is true, meaning the elevation is at least 2200 meters, it returns the first value (TRUE); where it is not true, it returns the second value (FALSE). We use the mutate() function to add the result as a new variable in our boulder data frame. We then use ggplot to plot these locations.
# library(dplyr)  # already attached by library(tidyverse)
boulder %>%
    mutate(high_elev = ifelse(PT_Elev >= 2200, TRUE, FALSE))%>% 
ggplot() +
  geom_sf(aes(color=high_elev),
    fill = NA, alpha = .2)  +  
  theme_bw()
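
To see what ifelse() returns, here is a minimal sketch on a hypothetical vector of elevations (the values are made up for illustration). It also shows that, for a TRUE/FALSE result, the comparison on its own gives the same answer.

```r
# Hypothetical elevations in meters (not from the Boulder data)
elev <- c(1800, 2200, 2450)

# ifelse(test, value_if_true, value_if_false) works element by element
high_elev <- ifelse(elev >= 2200, TRUE, FALSE)
high_elev

# The comparison already returns a logical vector, so this is equivalent
identical(high_elev, elev >= 2200)

# ifelse() is more useful when the two outcomes are labels
elev_class <- ifelse(elev >= 2200, "high", "low")
elev_class
```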

  1. We can also plot different charts using ggplot. Let’s compare the distance from roads for the social media photographs from each source. Here we use filter() to keep the social media points only. We use a box plot to compare the distribution (median and quartiles) of the distance from these photographs to the nearest road. What does this test?
boulder %>%
  filter(DB ==  'Pano' | DB == 'Flickr') %>%
  ggplot(aes(x=DB, y=Street_dis)) + 
  geom_boxplot()

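As you can see, the two box plots look highly similar: the centers and spreads of the distances are close, so the plot shows no obvious difference between the two sources. Note that a box plot draws medians and quartiles rather than means and standard deviations, and it is a visual comparison rather than a formal significance test. There are numerous other tests and charts that you can use to investigate the relationship between the locations of social media photographs and other locations in the park.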
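
If you want the means and standard deviations as numbers, you can compute them per group alongside the median. The sketch below uses a small made-up data frame with the same column names as the Boulder data (DB and Street_dis); the values are hypothetical.

```r
library(dplyr)

# Made-up distances (not from the Boulder data), same column names as the lab dataset
toy <- data.frame(
  DB = c("Flickr", "Flickr", "Flickr", "Pano", "Pano", "Pano"),
  Street_dis = c(10, 20, 60, 15, 25, 50)
)

# One row per source, with the statistics a box plot does and does not show
street_summary <- toy %>%
  group_by(DB) %>%
  summarise(
    median_dis = median(Street_dis),
    mean_dis   = mean(Street_dis),
    sd_dis     = sd(Street_dis)
  )
street_summary
```

To run this on the real data, one option is to replace toy with the filtered boulder data after calling sf::st_drop_geometry(), so that summarise() does not also have to combine the point geometries.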

Additional Geovis tools

We are also going to learn about two new packages that might be helpful for your data science approach. We will start with viridis, which provides color palettes designed to remain readable for people with color vision deficiency.

The color scale

The package viridis provides several color scales, including “viridis”, the primary choice, and three alternatives with similar properties: “magma”, “plasma”, and “inferno”.

library(sf)
library(ggspatial)
library(viridis)
## Loading required package: viridisLite
## the function returns hexadecimal color codes
## the integer gives the number of colors
magma(10)
##  [1] "#000004FF" "#180F3EFF" "#451077FF" "#721F81FF" "#9F2F7FFF" "#CD4071FF"
##  [7] "#F1605DFF" "#FD9567FF" "#FEC98DFF" "#FCFDBFFF"
boulder <- st_read("/Users/paigelund/Desktop/EAS_548/advanced_geovisualization_week_two/BoulderSocialMedia/BoulderSocialMedia.shp")
## Reading layer `BoulderSocialMedia' from data source 
##   `/Users/paigelund/Desktop/EAS_548/advanced_geovisualization_week_two/BoulderSocialMedia/BoulderSocialMedia.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 55519 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -788775 ymin: 1917813 xmax: -780555 ymax: 1930053
## Projected CRS: NAD_1983_Albers
ggplot() +
    geom_sf(data = boulder, aes(color=PT_Elev),
    fill = NA, alpha = .2) + 
    scale_colour_gradientn(colours = magma(10))
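
ggplot2 also ships viridis scales directly, so the palette vector does not have to be built by hand. The sketch below uses a small made-up data frame (the x, y and elev columns are hypothetical) and scale_colour_viridis_c(), which takes the palette name through its option argument.

```r
library(ggplot2)

# Made-up points with an elevation-like variable (not from the Boulder data)
toy <- data.frame(
  x = 1:5,
  y = c(2, 4, 3, 5, 1),
  elev = c(1600, 1800, 2000, 2200, 2400)
)

# Continuous viridis scale; option selects the palette ("magma", "plasma", ...)
p <- ggplot(toy, aes(x = x, y = y, colour = elev)) +
  geom_point(size = 3) +
  scale_colour_viridis_c(option = "magma")
p
```

With the lab data, the same scale can be added after geom_sf(data = boulder, aes(color = PT_Elev)) in place of scale_colour_gradientn(colours = magma(10)).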

We can also plot discrete values.

summary(boulder$DB)
##    Length     Class      Mode 
##     55519 character character
p <- ggplot() +
  annotation_spatial(boulder) +
  layer_spatial(boulder, aes(col = DB))
p + scale_color_brewer(palette = "Dark2")
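
Because DB is stored as a character column, summary() only reports its length and class. To see which categories exist and how many points fall in each, table() is one option. The sketch below uses a short made-up vector in place of boulder$DB.

```r
# Made-up category labels standing in for boulder$DB
db <- c("Flickr", "Pano", "Flickr", "Flickr")

# Count how many times each category occurs
db_counts <- table(db)
db_counts

# The category names on their own
names(db_counts)
```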

tmap

Alternatively, we can use tmap, another way to create maps in R.

library(tmap)
## Add the data - these are specific to the vector or raster
tm_shape(boulder) + 
  ## which variable, is there a class interval, palette, and other options
  tm_symbols(col='PT_Elev', 
             style='quantile', 
             palette = 'YlOrRd',
             border.lwd = NA,
             size = 0.1)
## 
## ── tmap v3 code detected ───────────────────────────────────────────────────────
## [v3->v4] `symbols()`: instead of `style = "quantile"`, use fill.scale =
## `tm_scale_intervals()`.
## ℹ Migrate the argument(s) 'style', 'palette' (rename to 'values') to
##   'tm_scale_intervals(<HERE>)'
## [v3->v4] `symbols()`: use 'fill' for the fill color of polygons/symbols
## (instead of 'col'), and 'col' for the outlines (instead of 'border.col').
## [cols4all] color palettes: use palettes from the R package cols4all. Run
## `cols4all::c4a_gui()` to explore them. The old palette name "YlOrRd" is named
## "brewer.yl_or_rd"
## Multiple palettes called "yl_or_rd" found: "brewer.yl_or_rd", "matplotlib.yl_or_rd". The first one, "brewer.yl_or_rd", is returned.

It is really easy to add cartographic elements in tmap

## here we are using a simple dataset of the world 
# tmap_mode("plot")
data("World")
tm_shape(World) +
    tm_polygons("gdp_cap_est", style='quantile', legend.title = "GDP Per Capita Estimate")
## 
## ── tmap v3 code detected ───────────────────────────────────────────────────────
## [v3->v4] `tm_polygons()`: instead of `style = "quantile"`, use fill.scale =
## `tm_scale_intervals()`.
## ℹ Migrate the argument(s) 'style' to 'tm_scale_intervals(<HERE>)'
## [tm_polygons()] Argument `legend.title` unknown.
## [tip] Consider a suitable map projection, e.g. by adding `+ tm_crs("auto")`.
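
The messages above describe how tmap version 4 expects these options. As a sketch based on those messages (assuming tmap 4 or later is installed), the classification moves into fill.scale and the legend title into fill.legend:

```r
library(tmap)

# Example dataset of world countries shipped with tmap
data("World")

# tmap v4 style: 'fill' names the variable, 'fill.scale' sets the class intervals,
# and 'fill.legend' carries the legend title
world_map <- tm_shape(World) +
  tm_polygons(
    fill = "gdp_cap_est",
    fill.scale = tm_scale_intervals(style = "quantile"),
    fill.legend = tm_legend(title = "GDP Per Capita Estimate")
  )
world_map
```

The older v3 arguments still draw a map, as the output above shows, but the legend title is dropped because legend.title is no longer recognized.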

It is really easy to make an interactive map in tmap as well

## the view mode creates an interactive map
tmap_mode("view")
## ℹ tmap mode set to "view".
tm_shape(World) +
    tm_polygons("gdp_cap_est", style='quantile', legend.title = "GDP Per Capita Estimate")
## 
## ── tmap v3 code detected ───────────────────────────────────────────────────────
## [v3->v4] `tm_polygons()`: instead of `style = "quantile"`, use fill.scale =
## `tm_scale_intervals()`.
## ℹ Migrate the argument(s) 'style' to 'tm_scale_intervals(<HERE>)'
## [tm_polygons()] Argument `legend.title` unknown.

Advanced Week 1 Lab Assignment

In this week’s lab, you will make an open science markdown that documents your process of data analysis and geovisualization. We will be using git to aid in version control for the code. Your assignment is to use Knitr to develop a markdown document that shows your analysis of the Boulder data (you can also use your own data if you wish). Demonstrate how you did your analysis giving step-by-step instructions with the accompanying code.

Questions

  1. Discuss the advantages and challenges associated with an open data science approach. Provide an example based on this week’s reading. (1-2 paragraphs)

  2. Create a markdown document that showcases an analysis of this week’s data or any other dataset of your choice. Include descriptive text that explains your analysis, and incorporate figures and geovisualizations. Include at least 1 chart and 1 map. Structure and explain your analysis with text, headings, highlights, images and other markdown basics.

Bonus: Capture a screenshot of the history of your Git commits. Share your strategy for utilizing Git in your workflow.

Here are the evaluation criteria for the geovisualizations. Questions will be worth 30% of your grade, while the geovisualization and explanation will be worth 70%.